nlp_architect.models.transformers package

Submodules

nlp_architect.models.transformers.base_model module

class nlp_architect.models.transformers.base_model.InputFeatures(input_ids, input_mask, segment_ids, label_id=None, valid_ids=None)[source]

Bases: object

A single set of features of data.

class nlp_architect.models.transformers.base_model.TransformerBase(model_type: str, model_name_or_path: str, labels: List[str] = None, num_labels: int = None, config_name=None, tokenizer_name=None, do_lower_case=False, output_path=None, device='cpu', n_gpus=0)[source]

Bases: nlp_architect.models.TrainableModel

Transformers base model (for working with pytorch-transformers models)

MODEL_CONFIGURATIONS = {'bert': (<class 'pytorch_transformers.modeling_bert.BertConfig'>, <class 'pytorch_transformers.tokenization_bert.BertTokenizer'>), 'quant_bert': (<class 'nlp_architect.models.transformers.quantized_bert.QuantizedBertConfig'>, <class 'pytorch_transformers.tokenization_bert.BertTokenizer'>), 'xlm': (<class 'pytorch_transformers.modeling_xlm.XLMConfig'>, <class 'pytorch_transformers.tokenization_xlm.XLMTokenizer'>), 'xlnet': (<class 'pytorch_transformers.modeling_xlnet.XLNetConfig'>, <class 'pytorch_transformers.tokenization_xlnet.XLNetTokenizer'>)}
evaluate_predictions(logits, label_ids)[source]
get_logits(batch)[source]

Get model logits from the given input.

static get_train_steps_epochs(max_steps: int, num_train_epochs: int, gradient_accumulation_steps: int, num_samples: int)[source]

Get the number of training steps and epochs.

Parameters:
  • max_steps (int) – max steps
  • num_train_epochs (int) – num epochs
  • gradient_accumulation_steps (int) – gradient accumulation steps
  • num_samples (int) – number of samples
Returns:

total steps, number of epochs

Return type:

Tuple
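
The relationship between these quantities can be illustrated with a short sketch (an illustration of the typical computation, not necessarily the exact library code; all names are local to the sketch):

    def train_steps_epochs(max_steps, num_train_epochs, gradient_accumulation_steps, num_samples):
        # number of optimizer updates produced by one pass over the data
        updates_per_epoch = num_samples // gradient_accumulation_steps
        if max_steps > 0:
            # a fixed step budget overrides the requested number of epochs
            return max_steps, max_steps // updates_per_epoch + 1
        return updates_per_epoch * num_train_epochs, num_train_epochs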

classmethod load_model(model_path: str, model_type: str, *args, **kwargs)[source]

Create a TransformerBase model from a given path

Parameters:
  • model_path (str) – path to model
  • model_type (str) – model type
Returns:

model

Return type:

TransformerBase
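
A minimal usage sketch, assuming a model was previously saved with save_model to the given directory (the path is a placeholder; in practice load_model is usually called on a concrete subclass such as TransformerSequenceClassifier):

    from nlp_architect.models.transformers.sequence_classification import TransformerSequenceClassifier

    # reload a previously saved classifier; path and model type are placeholders
    model = TransformerSequenceClassifier.load_model(model_path="/path/to/saved_model",
                                                     model_type="bert")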

optimizer
save_model(output_dir: str, save_checkpoint: bool = False, args=None)[source]

Save model/tokenizer/arguments to given output directory

Parameters:
  • output_dir (str) – path to output directory
  • save_checkpoint (bool, optional) – save as checkpoint. Defaults to False.
  • args ([type], optional) – arguments object to save. Defaults to None.
save_model_checkpoint(output_path: str, name: str)[source]

save model checkpoint

Parameters:
  • output_path (str) – output path
  • name (str) – name of checkpoint
scheduler
setup_default_optimizer(weight_decay: float = 0.0, learning_rate: float = 5e-05, adam_epsilon: float = 1e-08, warmup_steps: int = 0, total_steps: int = 0)[source]
to(device='cpu', n_gpus=0)[source]
nlp_architect.models.transformers.base_model.get_models(models: List[str])[source]

nlp_architect.models.transformers.quantized_bert module

Quantized BERT layers and model

class nlp_architect.models.transformers.quantized_bert.QuantizedBertAttention(config)[source]

Bases: pytorch_transformers.modeling_bert.BertAttention

prune_heads(heads)[source]
class nlp_architect.models.transformers.quantized_bert.QuantizedBertConfig(vocab_size_or_config_json_file=30522, hidden_size=768, num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, hidden_act='gelu', hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1, max_position_embeddings=512, type_vocab_size=2, initializer_range=0.02, layer_norm_eps=1e-12, **kwargs)[source]

Bases: pytorch_transformers.modeling_bert.BertConfig

pretrained_config_archive_map = {'bert-base-uncased': 'https://nlp-architect-data.s3-us-west-2.amazonaws.com/models/transformers/bert-base-uncased.json', 'bert-large-uncased': 'https://nlp-architect-data.s3-us-west-2.amazonaws.com/models/transformers/bert-large-uncased.json'}
class nlp_architect.models.transformers.quantized_bert.QuantizedBertEmbeddings(config)[source]

Bases: pytorch_transformers.modeling_bert.BertEmbeddings

class nlp_architect.models.transformers.quantized_bert.QuantizedBertEncoder(config)[source]

Bases: pytorch_transformers.modeling_bert.BertEncoder

class nlp_architect.models.transformers.quantized_bert.QuantizedBertForQuestionAnswering(config)[source]

Bases: nlp_architect.models.transformers.quantized_bert.QuantizedBertPreTrainedModel, pytorch_transformers.modeling_bert.BertForQuestionAnswering

class nlp_architect.models.transformers.quantized_bert.QuantizedBertForSequenceClassification(config)[source]

Bases: nlp_architect.models.transformers.quantized_bert.QuantizedBertPreTrainedModel, pytorch_transformers.modeling_bert.BertForSequenceClassification

class nlp_architect.models.transformers.quantized_bert.QuantizedBertForTokenClassification(config)[source]

Bases: nlp_architect.models.transformers.quantized_bert.QuantizedBertPreTrainedModel, pytorch_transformers.modeling_bert.BertForTokenClassification

class nlp_architect.models.transformers.quantized_bert.QuantizedBertIntermediate(config)[source]

Bases: pytorch_transformers.modeling_bert.BertIntermediate

class nlp_architect.models.transformers.quantized_bert.QuantizedBertLayer(config)[source]

Bases: pytorch_transformers.modeling_bert.BertLayer

class nlp_architect.models.transformers.quantized_bert.QuantizedBertModel(config)[source]

Bases: nlp_architect.models.transformers.quantized_bert.QuantizedBertPreTrainedModel, pytorch_transformers.modeling_bert.BertModel

class nlp_architect.models.transformers.quantized_bert.QuantizedBertOutput(config)[source]

Bases: pytorch_transformers.modeling_bert.BertOutput

class nlp_architect.models.transformers.quantized_bert.QuantizedBertPooler(config)[source]

Bases: pytorch_transformers.modeling_bert.BertPooler

class nlp_architect.models.transformers.quantized_bert.QuantizedBertPreTrainedModel(*inputs, **kwargs)[source]

Bases: pytorch_transformers.modeling_bert.BertPreTrainedModel

base_model_prefix = 'quant_bert'
config_class

alias of QuantizedBertConfig

init_weights(module)[source]

Initialize the weights.

class nlp_architect.models.transformers.quantized_bert.QuantizedBertSelfAttention(config)[source]

Bases: pytorch_transformers.modeling_bert.BertSelfAttention

class nlp_architect.models.transformers.quantized_bert.QuantizedBertSelfOutput(config)[source]

Bases: pytorch_transformers.modeling_bert.BertSelfOutput

nlp_architect.models.transformers.quantized_bert.quantized_embedding_setup(config, name, *args, **kwargs)[source]

Get QuantizedEmbedding layer according to config params

nlp_architect.models.transformers.quantized_bert.quantized_linear_setup(config, name, *args, **kwargs)[source]

Get QuantizedLinear layer according to config params

nlp_architect.models.transformers.sequence_classification module

class nlp_architect.models.transformers.sequence_classification.TransformerSequenceClassifier(model_type: str, labels: List[str] = None, task_type='classification', metric_fn=<function accuracy>, *args, **kwargs)[source]

Bases: nlp_architect.models.transformers.base_model.TransformerBase

Transformer sequence classifier

Parameters:
  • model_type (str) – transformer base model type
  • labels (List[str], optional) – list of labels. Defaults to None.
  • task_type (str, optional) – task type (classification/regression). Defaults to classification.
  • metric_fn ([type], optional) – metric to use for evaluation. Defaults to simple_accuracy.
MODEL_CLASS = {'bert': <class 'pytorch_transformers.modeling_bert.BertForSequenceClassification'>, 'quant_bert': <class 'nlp_architect.models.transformers.quantized_bert.QuantizedBertForSequenceClassification'>, 'xlm': <class 'pytorch_transformers.modeling_xlm.XLMForSequenceClassification'>, 'xlnet': <class 'pytorch_transformers.modeling_xlnet.XLNetForSequenceClassification'>}
convert_to_tensors(examples: List[nlp_architect.data.sequence_classification.SequenceClsInputExample], max_seq_length: int = 128, include_labels: bool = True) → torch.utils.data.dataset.TensorDataset[source]

Convert examples to tensor dataset

Parameters:
  • examples (List[SequenceClsInputExample]) – examples
  • max_seq_length (int, optional) – max sequence length. Defaults to 128.
  • include_labels (bool, optional) – include labels. Defaults to True.
Returns:

tensor dataset of the converted examples

Return type:

TensorDataset

evaluate_predictions(logits, label_ids)[source]

Run evaluation of given logits and truth labels

Parameters:
  • logits – model logits
  • label_ids – truth label ids
inference(examples: List[nlp_architect.data.sequence_classification.SequenceClsInputExample], batch_size: int = 64, evaluate=False)[source]

Run inference on given examples

Parameters:
  • examples (List[SequenceClsInputExample]) – examples to run inference on
  • batch_size (int, optional) – batch size. Defaults to 64.
  • evaluate (bool, optional) – whether to evaluate the predictions. Defaults to False.
Returns:

logits

train(train_data_set: torch.utils.data.dataloader.DataLoader, dev_data_set: Union[torch.utils.data.dataloader.DataLoader, List[torch.utils.data.dataloader.DataLoader]] = None, test_data_set: Union[torch.utils.data.dataloader.DataLoader, List[torch.utils.data.dataloader.DataLoader]] = None, gradient_accumulation_steps: int = 1, per_gpu_train_batch_size: int = 8, max_steps: int = -1, num_train_epochs: int = 3, max_grad_norm: float = 1.0, logging_steps: int = 50, save_steps: int = 100)[source]

Train a model

Parameters:
  • train_data_set (DataLoader) – training data set
  • dev_data_set (Union[DataLoader, List[DataLoader]], optional) – development set. Defaults to None.
  • test_data_set (Union[DataLoader, List[DataLoader]], optional) – test set. Defaults to None.
  • gradient_accumulation_steps (int, optional) – number of gradient accumulation steps. Defaults to 1.
  • per_gpu_train_batch_size (int, optional) – per GPU train batch size. Defaults to 8.
  • max_steps (int, optional) – max steps. Defaults to -1.
  • num_train_epochs (int, optional) – number of train epochs. Defaults to 3.
  • max_grad_norm (float, optional) – max gradient normalization. Defaults to 1.0.
  • logging_steps (int, optional) – number of steps between logging. Defaults to 50.
  • save_steps (int, optional) – number of steps between model save. Defaults to 100.
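
Putting the pieces together, a minimal end-to-end sketch might look as follows (the SequenceClsInputExample constructor arguments, labels, paths and hyperparameters shown here are assumptions/placeholders, not values prescribed by the library):

    from torch.utils.data import DataLoader, RandomSampler
    from nlp_architect.data.sequence_classification import SequenceClsInputExample
    from nlp_architect.models.transformers.sequence_classification import TransformerSequenceClassifier

    # toy training data; the constructor arguments are assumed for illustration
    examples = [SequenceClsInputExample(guid="1", text="a great movie", label="pos"),
                SequenceClsInputExample(guid="2", text="a dull movie", label="neg")]

    classifier = TransformerSequenceClassifier(model_type="bert",
                                               model_name_or_path="bert-base-uncased",
                                               labels=["neg", "pos"])
    train_ds = classifier.convert_to_tensors(examples, max_seq_length=128)
    train_dl = DataLoader(train_ds, sampler=RandomSampler(train_ds), batch_size=8)
    classifier.train(train_dl, num_train_epochs=3, logging_steps=50, save_steps=100)
    classifier.save_model("/path/to/output")
    logits = classifier.inference(examples, batch_size=64)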

nlp_architect.models.transformers.token_classification module

class nlp_architect.models.transformers.token_classification.BertForTagging(config)[source]

Bases: pytorch_transformers.modeling_bert.BertForTokenClassification

BERT token classification head with linear classifier.

The forward pass requires an additional ‘valid_ids’ mask that marks the valid tokens, so that extra word-piece tokens produced by the tokenizer (the ‘X’ label in NER tasks) are ignored.

forward(input_ids, token_type_ids=None, attention_mask=None, labels=None, position_ids=None, head_mask=None, valid_ids=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
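
For illustration only, a valid_ids mask can be built during tokenization by marking the first word-piece of every word (a sketch of the idea; in this package the preprocessing is handled internally when examples are converted to tensors):

    from pytorch_transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    words = ["Intel", "released", "NLP", "Architect"]
    tokens, valid_ids = [], []
    for word in words:
        pieces = tokenizer.tokenize(word)          # a word may split into several pieces
        for i, piece in enumerate(pieces):
            tokens.append(piece)
            valid_ids.append(1 if i == 0 else 0)   # classify only the first piece of each word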

class nlp_architect.models.transformers.token_classification.QuantizedBertForNERLinear(config)[source]

Bases: nlp_architect.models.transformers.quantized_bert.QuantizedBertForTokenClassification

Quantized BERT token classification head with linear classifier.

The forward pass requires an additional ‘valid_ids’ mask that marks the valid tokens, so that extra word-piece tokens produced by the tokenizer (the ‘X’ label in NER tasks) are ignored.

forward(input_ids, token_type_ids=None, attention_mask=None, labels=None, position_ids=None, head_mask=None, valid_ids=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class nlp_architect.models.transformers.token_classification.TransformerTokenClassifier(model_type: str, labels: List[str] = None, *args, **kwargs)[source]

Bases: nlp_architect.models.transformers.base_model.TransformerBase

Transformer word tagging classifier

Parameters:
  • model_type (str) – model family (classifier head), choose between bert/quant_bert/xlnet
  • labels (List[str], optional) – list of tag labels
MODEL_CLASS = {'bert': <class 'nlp_architect.models.transformers.token_classification.BertForTagging'>, 'quant_bert': <class 'nlp_architect.models.transformers.token_classification.QuantizedBertForNERLinear'>, 'xlnet': <class 'nlp_architect.models.transformers.token_classification.XLNetForTokenClassification'>}
convert_to_tensors(examples: List[nlp_architect.data.sequential_tagging.TokenClsInputExample], max_seq_length: int = 128, include_labels: bool = True) → torch.utils.data.dataset.TensorDataset[source]

Convert examples to tensor dataset

Parameters:
  • examples (List[TokenClsInputExample]) – examples
  • max_seq_length (int, optional) – max sequence length. Defaults to 128.
  • include_labels (bool, optional) – include labels. Defaults to True.
Returns:

tensor dataset of the converted examples

Return type:

TensorDataset

evaluate_predictions(logits, label_ids)[source]

Run evaluation of given logits and truth labels

Parameters:
  • logits – model logits
  • label_ids – truth label ids
static extract_labels(label_ids, label_map, logits)[source]
inference(examples: List[nlp_architect.data.sequential_tagging.TokenClsInputExample], batch_size: int = 64)[source]

Run inference on given examples

Parameters:
  • examples (List[TokenClsInputExample]) – examples to run inference on
  • batch_size (int, optional) – batch size. Defaults to 64.
Returns:

logits

train(train_data_set: torch.utils.data.dataloader.DataLoader, dev_data_set: Union[torch.utils.data.dataloader.DataLoader, List[torch.utils.data.dataloader.DataLoader]] = None, test_data_set: Union[torch.utils.data.dataloader.DataLoader, List[torch.utils.data.dataloader.DataLoader]] = None, gradient_accumulation_steps: int = 1, per_gpu_train_batch_size: int = 8, max_steps: int = -1, num_train_epochs: int = 3, max_grad_norm: float = 1.0, logging_steps: int = 50, save_steps: int = 100)[source]

Run model training

Parameters:
  • train_data_set (DataLoader) – training dataset
  • dev_data_set (Union[DataLoader, List[DataLoader]], optional) – development data set (can be a list). Defaults to None.
  • test_data_set (Union[DataLoader, List[DataLoader]], optional) – test data set (can be a list). Defaults to None.
  • gradient_accumulation_steps (int, optional) – gradient accumulation steps. Defaults to 1.
  • per_gpu_train_batch_size (int, optional) – per GPU (or CPU) train batch size. Defaults to 8.
  • max_steps (int, optional) – max steps for training. Defaults to -1.
  • num_train_epochs (int, optional) – number of training epochs. Defaults to 3.
  • max_grad_norm (float, optional) – max gradient norm. Defaults to 1.0.
  • logging_steps (int, optional) – number of steps between logging. Defaults to 50.
  • save_steps (int, optional) – number of steps between model save. Defaults to 100.
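
A minimal usage sketch (how the TokenClsInputExample list is produced is omitted, since it depends on the data reader; labels and hyperparameters are placeholders):

    from torch.utils.data import DataLoader, RandomSampler
    from nlp_architect.models.transformers.token_classification import TransformerTokenClassifier

    examples = []  # populate with TokenClsInputExample objects from your data reader

    tagger = TransformerTokenClassifier(model_type="bert",
                                        model_name_or_path="bert-base-uncased",
                                        labels=["O", "B-PER", "I-PER", "B-ORG", "I-ORG"])
    train_ds = tagger.convert_to_tensors(examples, max_seq_length=128)
    train_dl = DataLoader(train_ds, sampler=RandomSampler(train_ds), batch_size=8)
    tagger.train(train_dl, num_train_epochs=3)
    logits = tagger.inference(examples, batch_size=64)
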
class nlp_architect.models.transformers.token_classification.XLNetForTokenClassification(config)[source]

Bases: pytorch_transformers.modeling_xlnet.XLNetPreTrainedModel

forward(input_ids, token_type_ids=None, input_mask=None, attention_mask=None, mems=None, perm_mask=None, target_mapping=None, labels=None, head_mask=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

nlp_architect.models.transformers.token_classification.bert_for_ner_linear_forward(bert, input_ids, token_type_ids=None, attention_mask=None, labels=None, position_ids=None, head_mask=None, valid_ids=None)[source]

Module contents